WILLOW - 2018 - Annual activity report

Section: New Software and Platforms

Mixture-of-Embedding-Experts

Keyword: Computer vision

Functional Description: Joint understanding of video and language is an active research area with many applications. Prior work in this domain typically relies on learning text-video embeddings. One difficulty with this approach, however, is the lack of large-scale annotated video-caption datasets for training. To address this issue, we aim to learn text-video embeddings from heterogeneous data sources. To this end, we propose a Mixture-of-Embedding-Experts (MEE) model that can handle missing input modalities during training. As a result, our framework can learn improved text-video embeddings simultaneously from image and video datasets. We also show that MEE generalizes to other input modalities such as face descriptors.

Participants: Ivan Laptev and Josef Sivic
Contact: Antoine Miech
Publication: Learning a Text-Video Embedding from Incomplete and Heterogeneous Data
URL: https://www.di.ens.fr/willow/research/mee/
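
Illustrative sketch: the snippet below shows one plausible way to score a text-video pair with a mixture of embedding experts when some input modalities are missing. It is not the released MEE code. The function names (cosine, mee_similarity), the modality names (appearance, motion, face), the toy vectors, and the choice to renormalize softmax weights over the available modalities are all assumptions made for illustration. In a trained model the embeddings and gate logits would be produced by learned networks; here they are passed in directly.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def mee_similarity(text_embs, video_embs, gate_logits):
    """Weighted sum of per-expert similarities over the available modalities.

    text_embs:   dict modality -> text embedding for that expert
    video_embs:  dict modality -> visual embedding, or None if the modality is missing
    gate_logits: dict modality -> unnormalized expert weight (hypothetical values here)
    """
    # Keep only the experts whose visual input is present for this sample.
    available = [m for m in text_embs if video_embs.get(m) is not None]
    if not available:
        raise ValueError("at least one modality must be available")

    # Softmax restricted to the available experts, so their weights sum to 1.
    top = max(gate_logits[m] for m in available)
    exps = {m: math.exp(gate_logits[m] - top) for m in available}
    total = sum(exps.values())

    return sum(
        exps[m] / total * cosine(text_embs[m], video_embs[m]) for m in available
    )


# Toy inputs (assumed values): one text query, one video clip, one still image.
text = {"appearance": [1.0, 0.0], "motion": [0.0, 1.0], "face": [1.0, 1.0]}
logits = {"appearance": 0.0, "motion": 0.0, "face": 0.0}

# The video clip has appearance and motion features but no face descriptor.
video_clip = {"appearance": [1.0, 0.0], "motion": [0.0, 1.0], "face": None}

# A still image only provides appearance features.
still_image = {"appearance": [0.0, 1.0], "motion": None, "face": None}

clip_score = mee_similarity(text, video_clip, logits)
image_score = mee_similarity(text, still_image, logits)
```

Because the weights are renormalized over whichever experts are present, the same scoring function can be applied to image samples (no motion input) and video samples alike, which is the property the description above relies on for training on both kinds of dataset.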
